Exploiting idle cycles to execute data mining applications on clusters of PCs

Authors

  • Hermes Senger
  • Eduardo R. Hruschka
  • Fabrício Alves Barbosa da Silva
  • Liria Matsumoto Sato
  • Calebe De Paula Bianchini
  • Bruno F. Jerosch
Abstract

In this paper we present and evaluate Inhambu, a distributed object-oriented system that supports the execution of data mining applications on clusters of PCs and workstations. This system provides a resource management layer, built on top of Java/RMI, that supports the execution of the data mining tool Weka. We evaluate the performance of Inhambu by means of several experiments in homogeneous, heterogeneous and non-dedicated clusters. The results are compared with those achieved by a similar system named Weka-Parallel. Inhambu outperforms its counterpart for coarse-grained applications, mainly in heterogeneous and non-dedicated clusters. Our system also provides additional advantages such as application checkpointing, support for dynamic aggregation of hosts to the cluster, automatic restarting of failed tasks, and a more effective usage of the cluster. Therefore, Inhambu is a promising tool for efficiently executing real-world data mining applications. The software is available at the project's web site, http://incubadora.fapesp.br/projects/inhambu/. © 2006 Elsevier Inc. All rights reserved.
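The automatic restarting of failed tasks mentioned in the abstract can be illustrated with a minimal sketch. It is not Inhambu's scheduler and does not use Java/RMI: it is a plain Python illustration in which ordinary functions stand in for cluster hosts, and every name (`run_with_restarts`, `healthy_host`, `departed_host`, the `fold-*` task ids) is hypothetical. A task whose host fails is put back in the queue and retried, here on whichever host comes next in a simple round-robin order.

```python
from collections import deque


def run_with_restarts(tasks, workers, max_attempts=3):
    """Run every task, requeueing a task whenever its worker fails.

    tasks: dict mapping a task id to the task's argument.
    workers: list of callables standing in for hosts (an assumption of
        this sketch); a worker signals failure by raising an exception.
    Returns (results, attempts), both dicts keyed by task id.
    """
    pending = deque(tasks.items())
    attempts = {task_id: 0 for task_id in tasks}
    results = {}
    turn = 0
    while pending:
        task_id, arg = pending.popleft()
        worker = workers[turn % len(workers)]  # simple round-robin choice
        turn += 1
        attempts[task_id] += 1
        try:
            results[task_id] = worker(arg)
        except Exception:
            if attempts[task_id] >= max_attempts:
                raise RuntimeError(f"task {task_id} failed {max_attempts} times")
            pending.append((task_id, arg))  # restart the task later
    return results, attempts


def healthy_host(x):
    # Stand-in for a coarse-grained job, e.g. one cross-validation fold.
    return x * x


def departed_host(x):
    # Stand-in for a non-dedicated host that is no longer available.
    raise ConnectionError("host left the cluster")


results, attempts = run_with_restarts(
    {"fold-0": 2, "fold-1": 3, "fold-2": 4},
    [healthy_host, departed_host],
)
```

In this run `fold-1` is handed to the unavailable host twice before it lands on the healthy one, so it completes on its third attempt while the other two tasks complete on their first. A real resource manager would choose hosts from monitored load information instead of round-robin order.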


Similar resources

Inhambu: Data Mining Using Idle Cycles in Clusters of PCs

In this paper we present and evaluate Inhambu, a distributed object-oriented system that relies on dynamic monitoring to collect information about the availability of computational resources, providing the necessary support for the execution of data mining applications on clusters of PCs and workstations. We also describe a modified implementation of the data mining tool Weka, which executes the...


Volunteer Computing on Clusters

Clusters typically represent a homogeneous, well maintained pool of high-end computation resources. This makes them particularly attractive for volunteer computing, where unused compute cycles are utilized for scientific guest applications. Cluster nodes are not idle as often as public PCs, but they are frequently underutilized while actively executing parallel applications. Hence, fully exploi...


An Analysis of Idle CPU Cycles at University Computer Labs

Grid computing has a great potential for grand challenge scientific problems such as Molecular Simulation, High Energy Physics and Genome Informatics. Exploiting under-utilized resources is crucial for a cost-effective, large-scale grid computing platform (i.e., computational grid), but there has been little research work on how to predict what resources will be under-loaded in the near future....


Non-Dedicated Distributed Environment: A Solution for Safe and Continuous Exploitation of Idle Cycles

The Non-Dedicated Distributed Environment (NDDE) aims to muster the idle processing power of interactive computers (workstations or PCs) into a virtual resource for parallel applications and grid computing. NDDE is novel in the sense that it allows for safe and continuous use of idle cycles. Differently from existing solutions, NDDE applications run inside a virtual machine rather than on the u...


An Idle Compute Cycle Prediction Service for Computational Grids

The utilization of idle compute cycles has been known as the most promising and cost-effective way to build a large-scale, high-performance computing system, but it is not widely used because of the lack of effective idleness prediction techniques. In this paper, we argue that PCs at university computer labs have great potential for the utilization of idle CPU cycles, and propose two techniques for predictin...



Journal:
  • Journal of Systems and Software

Volume 80

Publication date: 2007